
r - How to get the maximum value of a column in a data.frame and return all matching records

Reposted. Author: 行者123. Updated: 2023-12-04 11:34:20

I have a data.frame, and I want to get the rows that contain the maximum value of a given column, Total.

Txn_date  Cust_no   Acct_no cust_type Credit Debit Total
09DEC2013 17382 601298644 I 1500 0 1500
16DEC2013 17382 601298644 I 500 0 500
17DEC2013 17382 601298644 I 0 60 60
18DEC2013 17382 601298644 I 0 200 200
19DEC2013 17382 601298644 I 1500 0 1500
20DEC2013 17382 601298644 I 0 60 60
20DEC2013 17382 601298644 I 0 103 103
30DEC2013 17382 601298644 I 500 0 500

So I wrote a simple SQL query and ran it through sqldf(), as follows:

s1<-paste("SELECT Txn_date, Cust_no,Credit,Debit,Total,max(Total) as 'MaxTxnAmt' FROM sample GROUP BY Cust_no")
sample_t1<-sqldf(s1)

This gives me

Txn_date Cust_no   Acct_no cust_type Credit Debit Total
09DEC2013 17382 601298644 I 1500 0 1500

If I use base R functions instead, I get exactly the same output as above:

sample_t1<-do.call(rbind,
lapply(split(sample,sample$Cust_no),
function(data) data[which.max(data$Total),]))
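
For context, the single row per customer comes from which.max(), which returns only the index of the first maximum. A minimal base R sketch on the Total values from the table above (the variable names total, first_max and all_max are illustrative, not from the original post):

```r
# Total column from the sample table above
total <- c(1500, 500, 60, 200, 1500, 60, 103, 500)

# which.max() stops at the first maximum, so ties are dropped
first_max <- which.max(total)

# Comparing against max() keeps the index of every tied maximum
all_max <- which(total == max(total))

print(first_max)
print(all_max)
```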

I would like to know how to get all rows of the sample table that have the maximum value of the column Total.

Expected output:

Txn_date Cust_no   Acct_no cust_type Credit Debit Total
09DEC2013 17382 601298644 I 1500 0 1500
19DEC2013 17382 601298644 I 1500 0 1500

Sample data:

sample <- structure(list(Txn_date = c("09DEC2013", "16DEC2013", "17DEC2013", 
"18DEC2013", "19DEC2013", "20DEC2013", "20DEC2013", "30DEC2013"
), Cust_no = c(17382L, 17382L, 17382L, 17382L, 17382L, 17382L,
17382L, 17382L), Acct_no = c("601298644", "601298644", "601298644",
"601298644", "601298644", "601298644", "601298644", "601298644"
), cust_type = c("I", "I", "I", "I", "I", "I", "I", "I"), Credit = c(1500,
500, 0, 0, 1500, 0, 0, 500), Debit = c(0, 0, 60, 200, 0, 60,
103, 0), Total = c(1500, 500, 60, 200, 1500, 60, 103, 500)), .Names = c("Txn_date",
"Cust_no", "Acct_no", "cust_type", "Credit", "Debit", "Total"
), row.names = c(16303L, 29153L, 31174L, 33179L, 35388L, 38750L,
38751L, 53052L), class = "data.frame")

Best answer

Try

library(dplyr)
sample %>%
group_by(Cust_no) %>%
filter( Total==max(Total))
# Txn_date Cust_no Acct_no cust_type Credit Debit Total
#1 09DEC2013 17382 601298644 I 1500 0 1500
#2 19DEC2013 17382 601298644 I 1500 0 1500

Or

library(data.table)
setDT(sample)[, .SD[Total==max(Total)] ,Cust_no]
# Cust_no Txn_date Acct_no cust_type Credit Debit Total
#1: 17382 09DEC2013 601298644 I 1500 0 1500
#2: 17382 19DEC2013 601298644 I 1500 0 1500

Or

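# Keyed join on Total; note that max(Total) in i is the overall maximum, not the per-customer maximum
setkey(setDT(sample), Total)[J(max(Total)), .SD,Cust_no]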
# Cust_no Txn_date Acct_no cust_type Credit Debit Total
#1: 17382 09DEC2013 601298644 I 1500 0 1500
#2: 17382 19DEC2013 601298644 I 1500 0 1500

Or using base R

sample[with(sample, ave(Total, Cust_no, FUN=max)==Total),]
# Txn_date Cust_no Acct_no cust_type Credit Debit Total
#1 09DEC2013 17382 601298644 I 1500 0 1500
#5 19DEC2013 17382 601298644 I 1500 0 1500
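
The sample data contains a single customer, so a small check with more than one customer may help confirm that the ave() approach keeps ties within each group. The sketch below uses a hypothetical two-customer data frame (txn, group_max and top_rows are made-up names, not part of the original post):

```r
# Hypothetical two-customer data, not from the original post
txn <- data.frame(
  Cust_no = c(1L, 1L, 1L, 2L, 2L, 2L),
  Total   = c(100, 300, 300, 50, 20, 50)
)

# ave() returns, for every row, the maximum Total within that row's Cust_no group
group_max <- ave(txn$Total, txn$Cust_no, FUN = max)

# Keep every row whose Total equals its group's maximum, ties included
top_rows <- txn[txn$Total == group_max, ]
print(top_rows)
```

Each customer keeps all rows tied at its own maximum, rather than only the rows that match the overall maximum.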

Regarding "r - How to get the maximum value of a column in a data.frame and return all matching records", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30728640/
